It could be argued that, for many NLP applications, a coarse differentiation between the meanings associated with a single lexical form is adequate to establish a basic interpretation which can guide subsequent processing. This might be the case for some MT systems, or for NLU systems which only need to establish the general context of discourse. Such a coarse differentiation might be made at the level of homonyms, leaving polysemous words to be associated with an underspecified lexical semantics which is never made fully precise. So what is the homonymy-polysemy distinction, and on what grounds is it made? I will show that the distinction is not clear, and is therefore not a useful basis for deciding which words and senses to include in the lexicon.
A word with (at least) two entirely distinct meanings that share a lexical form is said to be homonymous (e.g. mogul, an emperor, or mogul, a bump on a ski piste), while a word with several related senses is said to be polysemous (e.g. mouth, an organ of the body, the entrance of a cave, etc.) (Lyons, 1977). While these definitions are intuitively clear, it has been pointed out many times in the literature on lexical semantics that a clear operational distinction between homonymy and polysemy is lacking. I will review some of the criticisms below, but will begin by introducing an example which emphasises the difficulty of establishing criteria for distinguishing between homonymy and polysemy.
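To make the definitions concrete, the following is a minimal sketch of how a lexicon might encode the distinction, assuming the conventional treatment in which homonyms are separate entries sharing a form and polysemous senses are listed under a single entry. The names Lexeme, LEXICON and classify are hypothetical and introduced only for illustration; the sketch presupposes that the lexicographer has already decided which senses are related, which is precisely the decision questioned here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Lexeme:
    """One lexicon entry: a form plus the senses judged to be related."""
    form: str
    senses: List[str] = field(default_factory=list)


# Hypothetical toy lexicon built from the examples in the text.
# Two entries with the same form encode homonymy;
# one entry with several senses encodes polysemy.
LEXICON = [
    Lexeme("mogul", ["emperor"]),
    Lexeme("mogul", ["bump on a ski piste"]),
    Lexeme("mouth", ["organ of the body", "entrance of a cave"]),
]


def classify(form: str, lexicon: List[Lexeme] = LEXICON) -> str:
    """Read the homonymy/polysemy label off the lexicon's structure."""
    entries = [lexeme for lexeme in lexicon if lexeme.form == form]
    if len(entries) > 1:
        return "homonymous"
    if len(entries) == 1 and len(entries[0].senses) > 1:
        return "polysemous"
    if len(entries) == 1:
        return "monosemous"
    return "unknown"
```

Under these assumptions, classify("mogul") yields "homonymous" and classify("mouth") yields "polysemous". The label is determined entirely by how the entries were grouped in the first place, so the sketch records the distinction without offering any criterion for drawing it.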
One of the most commonly cited examples of a homonymous word is bank, which has a financial institution sense and an edge of a river sense. These senses seem clearly unrelated, and the fact that they are associated with the same word form seems purely accidental. However, historical linguistics research on Italian has revealed that at some point in the development of the Italian language, these two senses of bank actually coincided, by virtue of the fact that bankers (lenders of money) sat on the riverbanks while doing their business. Going to the financial institution therefore meant going to the edge of the river, hence to the bank. Thus a connection between the two modern senses of bank can be established. This historical connection should presumably not be considered strong enough to make the two senses related, and hence to make bank polysemous rather than homonymous, but what criteria for polysemy do the two senses violate? On what criteria do we decide that senses are related?